# Unsupervised pre-training
## Dinov2 With Registers Base
facebook · Apache-2.0 · Image Classification · Transformers · 22.74k downloads · 5 likes

A vision Transformer trained with DINOv2 and extended with register tokens, which absorb stray high-norm activations to produce cleaner attention maps and better-quality extracted features.
## RADIO
nvidia · 5,166 downloads · 36 likes

A visual feature extraction model from NVIDIA that converts images into embedding vectors for downstream tasks.
## Dinov2 Large
facebook · Apache-2.0 · Image Classification · Transformers · 558.78k downloads · 79 likes

A vision Transformer trained with the DINOv2 self-supervised method, which learns robust visual features from large-scale image data without labels.
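Although tagged "Image Classification", the DINOv2 checkpoints ship as bare backbones and are most often used to turn an image into a single embedding for retrieval, clustering, or a downstream linear classifier. A minimal sketch of that use with the `transformers` API, assuming the Dinov2 Large entry corresponds to the `facebook/dinov2-large` checkpoint:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

model_id = "facebook/dinov2-large"  # assumed Hub id for the entry above
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

image = Image.open("example.jpg")  # hypothetical file; any RGB image works
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pooled CLS embedding: one vector per image.
embedding = outputs.pooler_output  # shape (1, hidden_size)
```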
## Wav2vec2 Nsc Final 1 Google Colab
YuanWellspring · Speech Recognition · Transformers · 99 downloads · 0 likes

A speech processing model based on the wav2vec2 architecture; its training details are not fully disclosed.
## Wav2vec2 Base 10k Voxpopuli Ft En
facebook · Speech Recognition · Transformers · English · 40 downloads · 1 like

A Wav2Vec2 base model pre-trained on the 10k-hour unlabeled subset of the VoxPopuli corpus and fine-tuned on English transcription data, suitable for English speech recognition.
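Because this checkpoint is fine-tuned with a CTC head, it can transcribe audio directly. A minimal greedy-decoding sketch, assuming the entry corresponds to the `facebook/wav2vec2-base-10k-voxpopuli-ft-en` checkpoint and a 16 kHz mono input file:

```python
import torch
import soundfile as sf
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

model_id = "facebook/wav2vec2-base-10k-voxpopuli-ft-en"  # assumed Hub id
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

speech, sr = sf.read("sample.wav")  # hypothetical file; must be 16 kHz mono
inputs = processor(speech, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: most likely token per frame, then collapse repeats and blanks.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```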
## Wav2vec2 Large Slavic Voxpopuli V2
facebook · Speech Recognition · Transformers · 26 downloads · 0 likes

Facebook's Wav2Vec2 large model, pre-trained on roughly 89k hours of unlabeled Slavic-language audio from the VoxPopuli corpus.
## Wav2vec2 Large Baltic Voxpopuli V2
facebook · Speech Recognition · Transformers · 25 downloads · 0 likes

Facebook's Wav2Vec2 large model, pre-trained on 27.5k hours of unlabeled audio from the Baltic-language subset of the VoxPopuli corpus.
## Wav2vec2 Base Es Voxpopuli
facebook · Speech Recognition · Transformers · Spanish · 39 downloads · 2 likes

A Wav2Vec2 base model pre-trained on unlabeled Spanish audio from VoxPopuli.
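Unlike the fine-tuned checkpoint above, these pre-trained-only VoxPopuli models have no CTC head, so they cannot transcribe out of the box: they are intended to be fine-tuned on labeled data or used as acoustic feature extractors. A minimal feature-extraction sketch, assuming this entry corresponds to the `facebook/wav2vec2-base-es-voxpopuli` checkpoint:

```python
import torch
import soundfile as sf
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_id = "facebook/wav2vec2-base-es-voxpopuli"  # assumed Hub id
extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2Model.from_pretrained(model_id)

speech, sr = sf.read("sample_es.wav")  # hypothetical file; must be 16 kHz mono
inputs = extractor(speech, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    # One contextual vector per ~20 ms frame, usable as features for a downstream ASR head.
    hidden = model(**inputs).last_hidden_state  # shape (1, frames, hidden_size)
```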
## Mt5 Large
google · Apache-2.0 · Large Language Model · Multilingual · 404.82k downloads · 90 likes

mT5 is Google's multilingual text-to-text transfer Transformer, covering 101 languages and pre-trained on the mC4 dataset.
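mT5 is released as a pre-trained checkpoint only (a span-corruption objective, no supervised tasks), so it is meant to be fine-tuned before use. The sketch below shows the text-to-text fine-tuning interface rather than zero-shot generation, assuming the `google/mt5-large` checkpoint; the example strings are placeholders:

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-large")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-large")

# Every task is cast as text-to-text: both input and target are plain strings.
inputs = tokenizer("summarize: VoxPopuli is a large multilingual speech corpus.",
                   return_tensors="pt")
labels = tokenizer(text_target="A multilingual speech corpus.",
                   return_tensors="pt").input_ids

# Passing labels returns the cross-entropy loss used for fine-tuning.
loss = model(**inputs, labels=labels).loss
loss.backward()
```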
## Wav2vec2 Large Es Voxpopuli
facebook · Speech Recognition · Spanish · 117.04k downloads · 1 like

A Wav2Vec2 large model pre-trained on the Spanish subset of the VoxPopuli corpus, suitable as a base for Spanish speech recognition.
## Wav2vec2 Base Sv Voxpopuli V2
facebook · Speech Recognition · Transformers · Other · 30 downloads · 0 likes

A Wav2Vec2 base model pre-trained for Swedish on 16.3k hours of unlabeled audio from the VoxPopuli corpus.
## Wav2vec2 Base Fi Voxpopuli V2
facebook · Speech Recognition · Transformers · Other · 29 downloads · 1 like

A Wav2Vec2 base model pre-trained for Finnish on unlabeled VoxPopuli audio, suitable as a base for Finnish speech recognition.
## T5 Large Lm Adapt
google · Apache-2.0 · Large Language Model · Transformers · English · 501 downloads · 8 likes

The LM-adapted version of T5 v1.1: a T5 checkpoint further trained with a language-modeling objective, which improves its suitability for prompt tuning.
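The extra language-modeling training lets this checkpoint continue a plain text prefix directly, which is what makes it a common starting point for prompt tuning. A minimal generation sketch, assuming the entry corresponds to the `google/t5-large-lm-adapt` checkpoint:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "google/t5-large-lm-adapt"  # assumed Hub id for this entry
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# Unlike the span-corruption-only T5 checkpoints, this one can extend
# a prompt as a prefix language model.
inputs = tokenizer("Unsupervised pre-training is useful because", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```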
## Wav2vec2 Large Uralic Voxpopuli V2
facebook · Speech Recognition · Transformers · 46 downloads · 0 likes

A Wav2Vec2 large model pre-trained on 42.5k hours of unannotated Uralic-language audio from the VoxPopuli corpus.
## Wav2vec2 Base Da Voxpopuli V2
facebook · Speech Recognition · Transformers · Other · 35 downloads · 0 likes

A Wav2Vec2 base model pre-trained for Danish on 13.6k hours of unlabeled audio from the VoxPopuli corpus.
## Wav2vec2 Base Fr Voxpopuli V2
facebook · Speech Recognition · Transformers · French · 103 downloads · 1 like

Facebook's Wav2Vec2 base model, pre-trained exclusively on 22.8k hours of unlabeled French audio from the VoxPopuli corpus.
## Wav2vec2 Large 100k Voxpopuli
facebook · Speech Recognition · Other · 2,218 downloads · 4 likes

A Wav2Vec2 large model pre-trained on the 100k-hour unlabeled subset of the VoxPopuli corpus, providing multilingual speech representations.
## Wav2vec2 Large North Germanic Voxpopuli V2
facebook · Speech Recognition · Transformers · 25 downloads · 0 likes

A Wav2Vec2 large model pre-trained on the North Germanic-language subset of the VoxPopuli corpus.